Exploring The Role Of Punctuation In Parsing Natural Text

نویسنده

  • Bernard E. M. Jones
چکیده

Few, if any, current NLP systems make any significant use of punctuation. Intuitively, a treatment of lrunctuation seems necessary to the analysis and production of text. Whilst this has been suggested in the fiekls of discourse strnetnre, it is still nnclear whether punctuation can help in the syntactic field. This investigation atteml)ts to answer this question by parsing some corpus-based material with two similar grammars one including rules for i)unctuation, the other igno,'ing it. The punctuated grammar significantly outq)erforms the unpunctnated on% and so the conclnsion is that punctuation can play a usefifl role in syntactic processing.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

برچسب‌زنی خودکار نقش‌های معنایی در جملات فارسی به کمک درخت‌های وابستگی

Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...

متن کامل

Towards a Syntactic Account of Punctuation

Little notice has been taken of punctuation in the field of natural language processing, chiefly due to the lack of any coherent theory on which to base implementations. Some work has been carried out concerning punctuation and parsing, but much of it seems to have been rather ad-hoc and performance-motivated. This paper describes the first step towards the construction of a theoretically-motiv...

متن کامل

A System Description of P^4: Possible Punctuation Points Parser

We present a Natural Language Understanding (NLU) implementation that automatically inserts punctuation marks into a sequence of words to create a group of one or more syntactically correct sentences. The software, Possible Punctuation Points Parser (P^4) provides the ability for the user to input a string of words to process, performs the punctuation possibilities, and then provides several vi...

متن کامل

Exploring Features for Identifying Edited Regions in Disfluent Sentences

This paper describes our effort on the task of edited region identification for parsing disfluent sentences in the Switchboard corpus. We focus our attention on exploring feature spaces and selecting good features and start with analyzing the distributions of the edited regions and their components in the targeted corpus. We explore new feature spaces of a partof-speech (POS) hierarchy and rela...

متن کامل

Commas and Spaces: The Point of Punctuation

While it has been widely assumed that punctuation may play a critical role in parsing, there has been relatively little direct empirical investigation of its effects. Most researchers have either avoided the use of punctuation or have simply assumed that it will serve a disambiguating role. There has been little or no consideration of how ’disambiguation’ might occur or whether it is equally ef...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1994